Persist pgstat file to preserve statistic between sessions #357

Open
wants to merge 66 commits into
base: REL_15_STABLE_neon

knizhnik
@knizhnik knizhnik commented Feb 8, 2024

Statistics are saved in a local file and so are lost on compute restart.

Persist them in the page server using the same AUX file mechanism used for replication slots.

See more about motivation in https://neondb.slack.com/archives/C04DGM6SMTM/p1703077676522789

lubennikovaav and others added 30 commits February 6, 2024 13:05
Most significant changes are:
- `xlog.c` refactoring - some code was moved to `xlogreader.c` and `xlogprefetcher.c`.
- `ThisTimeLineID` refactoring (4a92a1c and e997a0c), which affects walproposer code
- `XLogFileInit` refactoring: multiple commits changed the function signature.
- Resolve initdb and pg_waldump neon-specific options that conflict with the ones from PostgreSQL.
* Move backpressure throttling implementation to the neon extension and add a function for monitoring throttling time

* Update src/include/miscadmin.h

Co-authored-by: Heikki Linnakangas <[email protected]>

Disabled by default. The plan is to merge this now, so that we can do
performance testing quickly, and if it helps, rewrite and review it
properly.

Author: Konstantin Knizhnik
Commit a703269 replaced $(INSTALL) with plain "cp" for installing the
server header files. It sped up "make install" significantly, because
the old logic called $(INSTALL) separately for every header file,
whereas plain "cp" could copy all the files in one command. However, we
have long since made it a requirement that $(INSTALL) can also install
multiple files in one command, see commit f1c5247. Switch back to
$(INSTALL).

Discussion: https://www.postgresql.org/message-id/200503252305.j2PN52m23610%40candle.pha.pa.us
Discussion: https://www.postgresql.org/message-id/2415283.1641852217%40sss.pgh.pa.us
to support only extensions that were built against Neon PostgreSQL
Neon generates PG_VERSION files in one format - just major version number without newline. Be consistent with it
No need to perform WAL recovery in Neon

Co-authored-by: Konstantin Knizhnik <[email protected]>
…ion because spec_token is not wal logged (#223)

* Pin pages with speculative insert tuples to prevent their reconstruction because spec_token is not wal logged

refer ##2587

* Update src/backend/access/heap/heapam.c

Co-authored-by: Heikki Linnakangas <[email protected]>

* Fix shared memory initialization for last written LSN cache

Replace (from,till) with (from,n_blocks) for SetLastWrittenLSNForBlockRange function

* Fast exit from SetLastWrittenLSNForBlockRange for n_blocks == 0
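The switch from `(from, till)` to `(from, n_blocks)` and the fast exit for an empty range can be illustrated with a minimal sketch. Everything here is a simplified stand-in, not Neon's actual cache: the flat array, `CACHE_BLOCKS`, the `Lsn` typedef, and the function name `set_lsn_for_block_range` are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_BLOCKS 1024       /* hypothetical cache size for this sketch */

typedef uint64_t Lsn;           /* stand-in for PostgreSQL's XLogRecPtr */

/* Hypothetical flat cache: one last-written LSN per block number. */
static Lsn last_written_lsn[CACHE_BLOCKS];

/*
 * Record 'lsn' for n_blocks blocks starting at block 'from'.
 * Returns the number of entries that were actually advanced.
 */
static size_t
set_lsn_for_block_range(Lsn lsn, uint32_t from, uint32_t n_blocks)
{
    size_t updated = 0;

    /* Fast exit: an empty range touches nothing. */
    if (n_blocks == 0)
        return 0;

    /* Precondition of this sketch: the range fits in the cache. */
    assert((uint64_t) from + n_blocks <= CACHE_BLOCKS);

    for (uint32_t i = 0; i < n_blocks; i++)
    {
        /* Only move an entry forward, never backward. */
        if (last_written_lsn[from + i] < lsn)
        {
            last_written_lsn[from + i] = lsn;
            updated++;
        }
    }
    return updated;
}
```

With a count instead of an end block, an empty range is simply `n_blocks == 0`, and there is no question of whether the end is inclusive.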
Without this patch, XLP_FIRST_IS_CONTRECORD was always set on bootstrap in the
header of the page where WAL writing continues. This confuses WAL decoding on
safekeepers, making it think decoding starts in the middle of a record, leading
to

 2022-08-12T17:48:13.816665Z ERROR {tid=37}: query handler for 'START_WAL_PUSH postgresql://no_user:@localhost:15050' failed: failed to run ReceiveWalConn

 Caused by:
    0: failed to process ProposerAcceptorMessage
    1: invalid xlog page header: unexpected XLP_FIRST_IS_CONTRECORD at 0/2CF8000

Rebase of a1af529 for v14.
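The failure described in the commit message above comes down to one flag bit in the WAL page header. The following is a minimal sketch of the kind of check a decoder might make, not the safekeeper's actual code: the function name `page_header_ok` and the `expect_record_start` parameter are hypothetical, and only the flag values are taken from PostgreSQL's `xlog_internal.h`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values as defined in PostgreSQL's xlog_internal.h. */
#define XLP_FIRST_IS_CONTRECORD 0x0001
#define XLP_LONG_HEADER         0x0002

/*
 * Hypothetical validation step: if the decoder believes a new record
 * starts on this page, the page must not claim that its first bytes
 * continue a record from the previous page.
 */
static bool
page_header_ok(uint16_t xlp_info, bool expect_record_start)
{
    bool is_continuation = (xlp_info & XLP_FIRST_IS_CONTRECORD) != 0;

    if (expect_record_start && is_continuation)
        return false;   /* "unexpected XLP_FIRST_IS_CONTRECORD" */
    return true;
}

/* Small usage illustration. */
static void
page_header_examples(void)
{
    /* Flag set although decoding starts at a record boundary: rejected. */
    assert(!page_header_ok(XLP_FIRST_IS_CONTRECORD, true));
    /* Long header without the continuation flag: accepted. */
    assert(page_header_ok(XLP_LONG_HEADER, true));
}
```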
- Refactor the way the WalProposerMain function is called when started
  with --sync-safekeepers. The postgres binary now explicitly loads
  the 'neon.so' library and calls the WalProposerMain in it. This is
  simpler than the global function callback "hook" we previously used.

- Move the WAL redo process code to a new library, neon_walredo.so,
  and use the same mechanism as for --sync-safekeepers to call the
  WalRedoMain function, when launched with --walredo argument.

- Also move the seccomp code to neon_walredo.so library. I kept the
  configure check in the postgres side for now, though.
Fix indentation, remove unused definitions, resolve some FIXMEs.
Previously, we called PrefetchBuffer [NBlkScanned * seqscan_prefetch_buffers]
times in each of those situations, but now only NBlkScanned.

In addition, the prefetch mechanism for the vacuum scans is now based on
blocks instead of tuples - improving the efficiency.
Parallel seqscans didn't take their parallelism into account when determining
which block to prefetch, and vacuum's cleanup scan didn't correctly determine
which blocks would need to be prefetched, and could get into an infinite loop.
* Use prefetch in pg_prewarm extension

* Change prefetch order as suggested in review
* Update prefetch mechanisms:

- **Enable enable_seqscan_prefetch by default**
- Store prefetch distance in the relevant scan structs
- Slow start sequential scan, to accommodate LIMIT clauses.
- Replace seqscan_prefetch_buffer with the relations' tablespaces'
  *_io_concurrency; and drop seqscan_prefetch_buffer as a result.
- Clarify enable_seqscan_prefetch GUC description
- Fix prefetch in pg_prewarm
- Add prefetching to autoprewarm worker
- Fix an issue where we'd incorrectly not prefetch data when hitting a table wraparound. The same issue also resulted in assertion failures in debug builds.
- Fix parallel scan prefetching - we didn't take into account that parallel scans have scan synchronization, too.
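The "slow start" item in the list above can be pictured as a prefetch distance that is stored in the scan state and ramps up as the scan proceeds, so that a query with a small LIMIT does not pay for a full prefetch window. The sketch below is an assumption-laden illustration: the struct, the function names, and the doubling policy are hypothetical, and the commit message does not state the exact ramp used.

```c
#include <assert.h>

/* Hypothetical per-scan prefetch state. */
typedef struct ScanPrefetchState
{
    int distance;       /* current prefetch distance, in blocks */
    int max_distance;   /* cap, e.g. taken from an *_io_concurrency setting */
} ScanPrefetchState;

static void
prefetch_init(ScanPrefetchState *s, int max_distance)
{
    assert(max_distance >= 0);
    s->distance = 0;                /* start with no prefetching at all */
    s->max_distance = max_distance;
}

/*
 * Called once per block consumed by the scan. Grows the distance
 * 0 -> 1 -> 2 -> 4 -> ... until it reaches the cap, and returns the
 * distance to use for the next prefetch request.
 */
static int
prefetch_advance(ScanPrefetchState *s)
{
    if (s->distance >= s->max_distance)
        return s->distance;

    s->distance = (s->distance == 0) ? 1 : s->distance * 2;
    if (s->distance > s->max_distance)
        s->distance = s->max_distance;
    return s->distance;
}
```

With a cap of 10, successive calls yield 1, 2, 4, 8, 10, 10, and so on. A cap of 0 disables prefetching.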
kelvich and others added 25 commits February 6, 2024 13:05
Now a similar kind of hack (using malloc() instead of shmem) is
done in the wal-redo extension.
* Adjust prefetch target for parallel bitmap scan

* More fixes for parallel bitmap scan prefetch
* Prefetch for index and index-only scans

* Remove debug logging

* Move prefetch_blocks array to the end of BTScanOpaqueData struct
* Recovery requirements:

Add condition variable for WAL recovery; allowing backends to wait for recovery up to some record pointer.

* Fix issues w.r.t. WAL when LwLsn is initiated and when recovery starts.
This fixes some test failures that showed up after updating Neon code to do
more precise handling of the request_lsn in a replica's get_page_at_lsn requests.

---------

Co-authored-by: Matthias van de Meent <[email protected]>
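The idea of letting backends wait for recovery to reach a given record pointer can be sketched with a condition variable guarding a "replayed up to" position. This sketch uses POSIX threads as a stand-in; PostgreSQL has its own `ConditionVariable` API, which is not used here, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

typedef uint64_t Lsn;   /* stand-in for XLogRecPtr */

/* Hypothetical shared state: how far replay has progressed. */
typedef struct RecoveryProgress
{
    pthread_mutex_t lock;
    pthread_cond_t  advanced;
    Lsn             replayed_upto;
} RecoveryProgress;

static RecoveryProgress progress = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0
};

/* Called by the replay loop after applying a record that ends at 'lsn'. */
static void
recovery_advance(Lsn lsn)
{
    pthread_mutex_lock(&progress.lock);
    if (lsn > progress.replayed_upto)
        progress.replayed_upto = lsn;
    /* Wake every waiter; each one rechecks its own target. */
    pthread_cond_broadcast(&progress.advanced);
    pthread_mutex_unlock(&progress.lock);
}

/* Called by a backend: block until replay has reached 'target'. */
static Lsn
recovery_wait_for(Lsn target)
{
    Lsn reached;

    pthread_mutex_lock(&progress.lock);
    /* Loop guards against spurious wakeups and too-small advances. */
    while (progress.replayed_upto < target)
        pthread_cond_wait(&progress.advanced, &progress.lock);
    reached = progress.replayed_upto;
    pthread_mutex_unlock(&progress.lock);

    assert(reached >= target);
    return reached;
}
```

A waiter whose target has already been replayed returns immediately without sleeping.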
* Make it possible to grant self created roles

* Update expected file for create_role test

---------

Co-authored-by: Konstantin Knizhnik <[email protected]>
…extended Neon SMGR API (#300)

Co-authored-by: Konstantin Knizhnik <[email protected]>
* [refer #111] Persist logical replication files in WAL and include them in basebackup at PS

* Fix warnings

* Write origin logical record snapshot in WAL only if there are valid origins

* Store only logical replication slots

* Fix dropping replication slots

* Replace sprintf with snprintf to make Arnica happy

* Do not checkpoint replication origin at shutdown

* Add PreCheckPointGuts function to sync replication state before start of shutdown checkpoint

* Log heap rewrite file after creation.

---------

Co-authored-by: Konstantin Knizhnik <[email protected]>
Co-authored-by: Arseny Sher <[email protected]>
* Update WAL buffers when restoring WAL at compute needed for LR

* Fix copying data in WAL buffers

---------

Co-authored-by: Konstantin Knizhnik <[email protected]>
* Prevent output callbacks from hearing about neon-file messages
* On demand downloading of SLRU segments

* Fix smgr_read_slru_segment

* Determine SLRU kind in extension

* Use ctl->PagePrecedes for SLRU page comparison in SimpleLruDownloadSegment to address wraparound

---------

Co-authored-by: Konstantin Knizhnik <[email protected]>
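The reason a plain `<` is not enough for SLRU page comparison in the commit message above is that page numbers derived from wrapping counters also wrap. The sketch below shows the general modular-comparison idea only; the real `PagePrecedes` callbacks in PostgreSQL are SLRU-specific and more involved, and the function name here is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical wraparound-aware ordering on a 32-bit circular space:
 * 'a' precedes 'b' if the distance from b to a, taken modulo 2^32 and
 * read as a signed value, is negative. The unsigned-to-signed cast
 * relies on two's-complement conversion, as mainstream compilers provide.
 */
static bool
page_precedes(uint32_t a, uint32_t b)
{
    return (int32_t) (a - b) < 0;
}

/* Small usage illustration. */
static void
page_precedes_examples(void)
{
    /* Ordinary case: agrees with plain '<'. */
    assert(page_precedes(1, 2));
    assert(!page_precedes(2, 1));

    /* Across the wrap point: a very large page number is "before" a
     * small one, although plain '<' would say the opposite. */
    assert(page_precedes(0xFFFFFFF0u, 5));
    assert(!page_precedes(5, 0xFFFFFFF0u));
}
```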
wallog_file_descriptor(char const* path, int fd)
{
char prefix[MAXPGPATH];
snprintf(prefix, sizeof(prefix), "neon-file:%s", path);

Static Code Analysis Risk: CWE-121 - Stack-based Buffer Overflow

The software directly writes into a stack buffer. This might lead to a stack-based buffer overflow. Avoid directly writing into stack buffers without proper boundary checks. Replace unsafe functions like strcpy, strcat, wcscpy, and wcscat with their safer counterparts such as strlcpy, strlcat, wcslcpy, and wcslcat, and use functions like strncpy, stpncpy, and their wide-character variants with caution, ensuring manual null-termination and proper buffer size checks.

Severity: High 🚨
Status: Open 🔴

References:

  1. https://cwe.mitre.org/data/definitions/121
  2. https://github.com/googleprojectzero/weggli
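For context on the finding above: the flagged line passes `sizeof(prefix)` to `snprintf`, and `snprintf` never writes more than the given size (including the terminating NUL), so the practical concern with that call is silent truncation of a long path rather than a write past the buffer. A minimal sketch of detecting truncation follows; the helper name `format_neon_file_prefix` and the explicit size parameter are hypothetical and not part of the PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format "neon-file:<path>" into dst.
 * Returns true if the whole string fit, false if it was truncated
 * (dst is still NUL-terminated in that case) or formatting failed.
 */
static bool
format_neon_file_prefix(char *dst, size_t dst_size, const char *path)
{
    int n;

    assert(dst != NULL && dst_size > 0 && path != NULL);

    /* snprintf returns the length the full string would have had. */
    n = snprintf(dst, dst_size, "neon-file:%s", path);

    return n >= 0 && (size_t) n < dst_size;
}
```

A caller can then decide whether a truncated prefix should be reported as an error instead of being used as is.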

You received this notification because a new code risk has been identified

@tristan957 tristan957 force-pushed the REL_15_STABLE_neon branch 2 times, most recently from 3be8940 to e2dbd63 Compare May 20, 2024 14:48
9 participants